Accelerated Multiple Precision Matrix Multiplication using Strassen's Algorithm and Winograd's Variant

Author

  • Tomonori Kouya
Abstract

The Strassen algorithm and Winograd's variant accelerate matrix multiplication by using fewer arithmetic operations than standard matrix multiplication. Although many papers have been published on accelerating single- as well as double-precision matrix multiplication with these algorithms, no research to date has been undertaken to accelerate multiple precision matrix multiplication. In this paper, we propose a multiple precision matrix multiplication program for matrices of any size and test its performance. We also reveal special properties of our program through its application to LU decomposition.
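To make the idea in the abstract concrete, the following is a minimal sketch of Strassen's recursion applied to multiple precision numbers. It is not the author's program, which is not reproduced on this page. Python's standard `decimal` module stands in for a multiple precision library, the 50-digit precision is an arbitrary choice, and zero-padding to an even size is just one assumed way to handle matrices of any size. All function names (`strassen`, `naive_mul`, `pad`, `split`) are hypothetical. Winograd's variant keeps the same seven block multiplications but rearranges the sums to use 15 block additions instead of 18; it is not shown here.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # assumed working precision, in decimal digits


def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def naive_mul(A, B):
    """Standard triple-loop product of an n x m and an m x p matrix."""
    m, p = len(B), len(B[0])
    return [[sum((row[k] * B[k][j] for k in range(m)), Decimal(0))
             for j in range(p)] for row in A]


def pad(M, size):
    """Zero-pad M to a size x size square matrix."""
    rows = [row + [Decimal(0)] * (size - len(row)) for row in M]
    return rows + [[Decimal(0)] * size for _ in range(size - len(M))]


def split(M, h):
    """Return the four h x h quadrants of a 2h x 2h matrix."""
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B, cutoff=2):
    """Multiply A (n x m) by B (m x p) with Strassen's seven-product recursion."""
    n, m, p = len(A), len(B), len(B[0])
    size = max(n, m, p)
    if size <= cutoff:
        return naive_mul(A, B)  # small blocks: fall back to the standard method
    size += size % 2            # round up to an even size before halving
    h = size // 2
    A11, A12, A21, A22 = split(pad(A, size), h)
    B11, B12, B21, B22 = split(pad(B, size), h)

    # Seven recursive block products instead of eight.
    M1 = strassen(mat_add(A11, A22), mat_add(B11, B22), cutoff)
    M2 = strassen(mat_add(A21, A22), B11, cutoff)
    M3 = strassen(A11, mat_sub(B12, B22), cutoff)
    M4 = strassen(A22, mat_sub(B21, B11), cutoff)
    M5 = strassen(mat_add(A11, A12), B22, cutoff)
    M6 = strassen(mat_sub(A21, A11), mat_add(B11, B12), cutoff)
    M7 = strassen(mat_sub(A12, A22), mat_add(B21, B22), cutoff)

    C11 = mat_add(mat_sub(mat_add(M1, M4), M5), M7)
    C12 = mat_add(M3, M5)
    C21 = mat_add(M2, M4)
    C22 = mat_add(mat_add(mat_sub(M1, M2), M3), M6)

    full = ([l + r for l, r in zip(C11, C12)] +
            [l + r for l, r in zip(C21, C22)])
    return [row[:p] for row in full[:n]]  # strip the padding again
```

In practice the cutoff would be tuned so that the recursion stops once the standard method becomes faster for small blocks; the value 2 here is only for illustration.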


Similar resources

GEMMW: A Portable Level 3 BLAS Winograd Variant of Strassen's Matrix-Matrix Multiply Algorithm

Matrix-matrix multiplication is normally computed using one of the BLAS or a reinvention of part of the BLAS. Unfortunately, the BLAS were designed with small matrices in mind. When huge, well-conditioned matrices are multiplied together, the BLAS perform like the blahs, even on vector machines. For matrices where the coefficients are well conditioned, Winograd's variant of Strassen's algorithm o...


A three-dimensional approach to parallel matrix multiplication

A three-dimensional (3D) matrix multiplication algorithm for massively parallel processing systems is presented. The P processors are configured as a "virtual" processing cube with dimensions p1, p2, and p3 proportional to the matrices' dimensions M, N, and K. Each processor performs a single local matrix multiplication of size M/p1 × N/p2 × K/p3. Before the local computation can be carried out,...


Architecture-efficient Strassen's Matrix Multiplication: A Case Study of Divide-and-Conquer Algorithms

Many fast algorithms in arithmetic complexity have hierarchical or recursive structures that make efficient implementations on high performance computers with memory hierarchies non-trivial. In this paper we present our findings on efficient implementation of Strassen's algorithm [17] for the ubiquitous operation of matrix multiplication as a model for a class of recursive algorithms. In comparison to...


Recursive Array Layouts and Fast Matrix Multiplication

The performance of both serial and parallel implementations of matrix multiplication is highly sensitive to memory system behavior. False sharing and cache conflicts cause traditional column-major or row-major array layouts to incur high variability in memory system performance as matrix size varies. This paper investigates the use of recursive array layouts to improve performance and reduce vari...
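As an illustration of what a recursive array layout can look like, the sketch below computes the Morton (Z-order) position of a matrix element by interleaving the bits of its row and column indices. Morton order is one well-known recursive layout; whether it matches the layouts studied in the paper above is an assumption, since the excerpt is truncated. The function name `morton_index` and the 16-bit index width are hypothetical choices.

```python
def morton_index(row, col, bits=16):
    """Interleave the low `bits` bits of row and col into one Z-order index.

    Column bits go to even positions and row bits to odd positions, so each
    2 x 2 block (and recursively each 2^k x 2^k block) is stored contiguously.
    """
    z = 0
    for b in range(bits):
        z |= ((col >> b) & 1) << (2 * b)
        z |= ((row >> b) & 1) << (2 * b + 1)
    return z


def morton_order(n):
    """List the (row, col) pairs of an n x n matrix in Morton order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    return sorted(cells, key=lambda rc: morton_index(rc[0], rc[1]))
```

With this ordering, each quadrant of the matrix occupies one contiguous range of memory, which is the property that suits recursive algorithms such as Strassen's.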


A BSP Realisation of Strassen's Algorithm

An efficient BSP realisation of Strassen's matrix multiplication algorithm is described.

1 Strassen's Algorithm

Let $A$ and $B$ be two $n \times n$ matrices and consider the problem of computing $C = AB$. We can regard the matrices $A$, $B$, $C$ as each composed of four $n/2 \times n/2$ submatrices. For example, $A = \begin{pmatrix} A_{00} & A_{01} \\ A_{10} & A_{11} \end{pmatrix}$. If the submatrices of $B$ and $C$ are described in the same way then we have $C_{ij} = A_{i0} B_{0j} + A_{i1} B_{1j}$ for all $i$,...
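The block formula in this excerpt (each submatrix of C is the sum of two submatrix products) can be checked with a short sketch. It uses plain Python lists and integers, and the helper names `matmul`, `matadd`, `quadrants`, and `block_product` are hypothetical. Evaluated directly, the formula needs eight half-size multiplications; Strassen's algorithm obtains the same four blocks from seven.

```python
def matmul(A, B):
    """Standard product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def quadrants(M):
    """Split an n x n matrix (n even) into a 2 x 2 grid of n/2 x n/2 blocks."""
    h = len(M) // 2
    return [[[r[:h] for r in M[:h]], [r[h:] for r in M[:h]]],
            [[r[:h] for r in M[h:]], [r[h:] for r in M[h:]]]]


def block_product(A, B):
    """Compute C = AB blockwise: C[i][j] = A[i][0] B[0][j] + A[i][1] B[1][j]."""
    Ab, Bb = quadrants(A), quadrants(B)
    C = [[matadd(matmul(Ab[i][0], Bb[0][j]), matmul(Ab[i][1], Bb[1][j]))
          for j in range(2)] for i in range(2)]
    # Stitch the four blocks back into one matrix.
    top = [left + right for left, right in zip(C[0][0], C[0][1])]
    bottom = [left + right for left, right in zip(C[1][0], C[1][1])]
    return top + bottom
```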



Journal:
  • CoRR

Volume abs/1410.1599

Publication date: 2014